keywords:"klasifikace dokumentů" - Search Results - Digital Repository

	Using of Data Mining Method for Analysis of Social Networks Novosad, Andrej ; Očenášek, Pavel (referee) ; Bartík, Vladimír (advisor) The thesis discusses data mining of social media. It introduces data mining and the available mining methods. It also explores social media and social networks, what they are able to offer, and what problems they bring. The APIs of three social networking sites are examined with regard to the opportunities they provide for data mining. Techniques of text mining and document classification are explored. The implementation of a web application that mines data from Twitter using the SVM algorithm is described. The application classifies tweets based on their text, where the classes represent the tweets' continents of origin. Several experiments, executed both in RapidMiner and in the implemented web application, are then proposed and their results examined.
	Analysis of Social Media Content Discussing Czech Mobile Operators Pavlů, Jan ; Otrusina, Lubomír (referee) ; Smrž, Pavel (advisor) The main topic of this thesis is sentiment analysis of posts obtained from social networks. The posts are about Czech mobile network operators. Data visualization is also an essential part of the implemented system. The sentiment analysis is done using machine learning techniques. Downloaded posts are cleaned, lemmatized and transformed into feature vectors. The Stochastic Gradient Descent algorithm is used for classification. Analyzed data are visualized in charts and as a list of posts. The system provides tools for text categorization. The accuracy, precision, recall and F1 score of sentiment analysis are about 75%. The accuracy of post categorization is high (about 80%), but precision, recall and F1 score are low (about 30%). This is why post categorization is not done automatically. The benefit of the system is that it automatically collects data from different sources, analyzes them and displays them. It also provides tools for manually changing sentiment and categories, which can improve the system's characteristics with some help from users.
	Sentiment Analysis with Use of Data Mining Sychra, Martin ; Burget, Radek (referee) ; Bartík, Vladimír (advisor) The theme of the work is sentiment analysis, especially in terms of informatics (and marginally from a linguistic point of view). The linguistic part discusses the term sentiment and language methods for its analysis, e.g. lemmatization, POS tagging, and the use of a stopword list. More attention is paid to the structure of the sentiment analyzer, which is based on several machine learning methods (support vector machines, Naive Bayes and maximum entropy classification). On the basis of the theoretical background, a functional analyzer is designed and implemented. The experiments focus mainly on comparing the classification methods and on the benefits of the individual preprocessing methods. The success rate of the constructed classifier reaches up to 84% in cross-validation.
	Artificial Intelligence Document Classification Molnár, Ondřej ; Kačic, Matej (referee) ; Třeštíková, Lenka (advisor) This paper deals with document classification using artificial intelligence. It describes the principles of classification and machine learning. It also introduces AI methods and presents the Naive Bayes classification method in detail. It provides a practical implementation of the classifier in MS Office and discusses possible extensions.
	Automated contract classification for portal HlidacSmluv.cz Maroušek, Jakub ; Nečaský, Martin (advisor) ; Holub, Martin (referee) The Contracts Register is a public database containing contracts concluded by public institutions. Due to the number of documents in the database, data analysis is problematic. The objective of this thesis is to find a machine learning approach for sorting the contracts into categories by their area of interest (real estate services, construction, etc.) and to implement the approach for use on the web portal Hlídač státu. A large number of categories and the lack of a tagged dataset of contracts complicate the solution.
	Semantic annotations Dědek, Jan ; Vojtáš, Peter (advisor) ; Maynard, Diana (referee) ; Železný, Filip (referee) Four relatively separate topics are presented in the thesis. Each topic represents one particular aspect of the Information Extraction discipline. The first two topics focus on our information extraction methods based on deep language parsing. The first topic relates to how deep language parsing was used in our extraction method in combination with manually designed extraction rules. The second topic deals with a method for automated induction of extraction rules using Inductive Logic Programming. The third topic of the thesis combines information extraction with rule-based reasoning. The core of our extraction method was experimentally reimplemented using semantic web technologies, which allows the extraction rules to be saved in so-called shareable extraction ontologies that do not depend on the original extraction tool. The last topic of the thesis deals with document classification and fuzzy logic. We investigate the possibility of using information obtained by information extraction techniques for document classification. Our implementation of the so-called Fuzzy ILP Classifier was used experimentally for document classification.
